Exploration server for the TEI

Warning: this site is under development.
Warning: this site was generated automatically from raw corpora.
The information has therefore not been validated.

SusTEInability of linguistic resources through feature structures

Internal identifier: 000087 (Main/Exploration); previous: 000086; next: 000088

Authors: Andreas Witt [Germany]; Georg Rehm [Germany]; Erhard Hinrichs [Germany]; Timm Lehmberg [Germany]; Jens Stegmann [Germany]

Source :

RBID : ISTEX:9F3C60DBB95AD64EA616839B33A16ACA18E60DB9

French descriptors

English descriptors

Abstract

This article shows that the TEI tag set for feature structures can be adopted to represent a heterogeneous set of linguistic corpora. The majority of corpora is annotated using markup languages that are based on the Annotation Graph framework, the upcoming Linguistic Annotation Format ISO standard, or according to tag sets defined by or based upon the TEI guidelines. A unified representation comprises the separation of conceptually different annotation layers contained in the original corpus data (e.g. syntax, phonology, and semantics) into multiple XML files. These annotation layers are linked to each other implicitly by the identical textual content of all files. A suitable data structure for the representation of these annotations is a multi-rooted tree that again can be represented by the TEI and ISO tag set for feature structures. The mapping process and representational issues are discussed as well as the advantages and drawbacks associated with the use of the TEI tag set for feature structures as a storage and exchange format for linguistically annotated data.
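
As a rough illustration of the idea in the abstract, the sketch below encodes a single annotation as a TEI feature structure and checks that two annotation layers share the same primary text. It is only a minimal sketch under stated assumptions: the helper names `tei`, `annotation_to_fs`, and `layers_share_text`, the layer names, and the sample data are all hypothetical and are not taken from the article, whose actual mapping procedure is not described on this page. Only the TEI element names `fs`, `f`, and `symbol` and the TEI namespace come from the TEI guidelines.

```python
import xml.etree.ElementTree as ET

TEI_NS = "http://www.tei-c.org/ns/1.0"  # TEI P5 namespace


def tei(tag):
    """Return the namespace-qualified name of a TEI element."""
    return f"{{{TEI_NS}}}{tag}"


def annotation_to_fs(layer, features):
    """Encode one annotation as <fs type=layer> with one <f> per feature.

    Hypothetical helper: each feature value is stored as a <symbol>.
    """
    fs = ET.Element(tei("fs"), {"type": layer})
    for name, value in features.items():
        f = ET.SubElement(fs, tei("f"), {"name": name})
        ET.SubElement(f, tei("symbol"), {"value": value})
    return fs


def layers_share_text(layer_texts):
    """Check the implicit link: every layer file carries the same text."""
    return len(set(layer_texts.values())) == 1


# Hypothetical example data, not taken from the article.
syntax_fs = annotation_to_fs("syntax", {"cat": "NP"})
linked = layers_share_text({"syntax": "the cat", "phonology": "the cat"})
print(ET.tostring(syntax_fs, encoding="unicode"))
```

Keeping each layer in its own structure and comparing only the text mirrors the abstract's point that the layers are linked implicitly by identical textual content rather than by explicit pointers.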

Url:
DOI: 10.1093/llc/fqp024


Affiliations:



The document in XML format

<record>
<TEI wicri:istexFullTextTei="biblStruct">
<teiHeader>
<fileDesc>
<titleStmt>
<title>SusTEInability of linguistic resources through feature structures</title>
<author wicri:is="90%">
<name sortKey="Witt, Andreas" sort="Witt, Andreas" uniqKey="Witt A" first="Andreas" last="Witt">Andreas Witt</name>
</author>
<author wicri:is="90%">
<name sortKey="Rehm, Georg" sort="Rehm, Georg" uniqKey="Rehm G" first="Georg" last="Rehm">Georg Rehm</name>
</author>
<author wicri:is="90%">
<name sortKey="Hinrichs, Erhard" sort="Hinrichs, Erhard" uniqKey="Hinrichs E" first="Erhard" last="Hinrichs">Erhard Hinrichs</name>
</author>
<author wicri:is="90%">
<name sortKey="Lehmberg, Timm" sort="Lehmberg, Timm" uniqKey="Lehmberg T" first="Timm" last="Lehmberg">Timm Lehmberg</name>
</author>
<author wicri:is="90%">
<name sortKey="Stegmann, Jens" sort="Stegmann, Jens" uniqKey="Stegmann J" first="Jens" last="Stegmann">Jens Stegmann</name>
</author>
</titleStmt>
<publicationStmt>
<idno type="wicri:source">ISTEX</idno>
<idno type="RBID">ISTEX:9F3C60DBB95AD64EA616839B33A16ACA18E60DB9</idno>
<date when="2009" year="2009">2009</date>
<idno type="doi">10.1093/llc/fqp024</idno>
<idno type="url">https://api.istex.fr/document/9F3C60DBB95AD64EA616839B33A16ACA18E60DB9/fulltext/pdf</idno>
<idno type="wicri:Area/Istex/Corpus">000247</idno>
<idno type="wicri:Area/Istex/Curation">000247</idno>
<idno type="wicri:Area/Istex/Checkpoint">000056</idno>
<idno type="wicri:explorRef" wicri:stream="Istex" wicri:step="Checkpoint">000056</idno>
<idno type="wicri:doubleKey">0268-1145:2009:Witt A:susteinability:of:linguistic</idno>
<idno type="wicri:Area/Main/Merge">000087</idno>
<idno type="wicri:source">INIST</idno>
<idno type="RBID">Francis:11-0223735</idno>
<idno type="wicri:Area/PascalFrancis/Corpus">000007</idno>
<idno type="wicri:Area/PascalFrancis/Curation">000038</idno>
<idno type="wicri:Area/PascalFrancis/Checkpoint">000014</idno>
<idno type="wicri:explorRef" wicri:stream="PascalFrancis" wicri:step="Checkpoint">000014</idno>
<idno type="wicri:doubleKey">0268-1145:2009:Witt A:susteinability:of:linguistic</idno>
<idno type="wicri:Area/Main/Merge">000117</idno>
<idno type="wicri:Area/Main/Curation">000087</idno>
<idno type="wicri:Area/Main/Exploration">000087</idno>
</publicationStmt>
<sourceDesc>
<biblStruct>
<analytic>
<title level="a">SusTEInability of linguistic resources through feature structures</title>
<author wicri:is="90%">
<name sortKey="Witt, Andreas" sort="Witt, Andreas" uniqKey="Witt A" first="Andreas" last="Witt">Andreas Witt</name>
<affiliation wicri:level="3">
<country xml:lang="fr">Allemagne</country>
<wicri:regionArea>Institut für Deutsche Sprache, Mannheim</wicri:regionArea>
<placeName>
<region type="land" nuts="1">Bade-Wurtemberg</region>
<region type="district" nuts="2">District de Karlsruhe</region>
<settlement type="city">Mannheim</settlement>
</placeName>
</affiliation>
<affiliation wicri:level="3">
<country xml:lang="fr">Allemagne</country>
<wicri:regionArea>vionto GmbH, Berlin</wicri:regionArea>
<placeName>
<region type="land" nuts="3">Berlin</region>
<settlement type="city">Berlin</settlement>
</placeName>
</affiliation>
<affiliation wicri:level="1">
<country xml:lang="fr">Allemagne</country>
<wicri:regionArea>Tübingen University, General and Computational Linguistics</wicri:regionArea>
<wicri:noRegion>General and Computational Linguistics</wicri:noRegion>
<wicri:noRegion>General and Computational Linguistics</wicri:noRegion>
</affiliation>
<affiliation wicri:level="1">
<country xml:lang="fr">Allemagne</country>
<wicri:regionArea>Hamburg University, SFB Multilingualism</wicri:regionArea>
<wicri:noRegion>SFB Multilingualism</wicri:noRegion>
<wicri:noRegion>SFB Multilingualism</wicri:noRegion>
</affiliation>
<affiliation wicri:level="1">
<country xml:lang="fr">Allemagne</country>
<wicri:regionArea>Bielefeld University, Faculty of Linguistics and Literary Studies</wicri:regionArea>
<wicri:noRegion>Faculty of Linguistics and Literary Studies</wicri:noRegion>
<wicri:noRegion>Faculty of Linguistics and Literary Studies</wicri:noRegion>
</affiliation>
<affiliation wicri:level="1">
<country wicri:rule="url">Allemagne</country>
</affiliation>
</author>
<author wicri:is="90%">
<name sortKey="Rehm, Georg" sort="Rehm, Georg" uniqKey="Rehm G" first="Georg" last="Rehm">Georg Rehm</name>
<affiliation wicri:level="3">
<country xml:lang="fr">Allemagne</country>
<wicri:regionArea>Institut für Deutsche Sprache, Mannheim</wicri:regionArea>
<placeName>
<region type="land" nuts="1">Bade-Wurtemberg</region>
<region type="district" nuts="2">District de Karlsruhe</region>
<settlement type="city">Mannheim</settlement>
</placeName>
</affiliation>
<affiliation wicri:level="3">
<country xml:lang="fr">Allemagne</country>
<wicri:regionArea>vionto GmbH, Berlin</wicri:regionArea>
<placeName>
<region type="land" nuts="3">Berlin</region>
<settlement type="city">Berlin</settlement>
</placeName>
</affiliation>
<affiliation wicri:level="1">
<country xml:lang="fr">Allemagne</country>
<wicri:regionArea>Tübingen University, General and Computational Linguistics</wicri:regionArea>
<wicri:noRegion>General and Computational Linguistics</wicri:noRegion>
<wicri:noRegion>General and Computational Linguistics</wicri:noRegion>
</affiliation>
<affiliation wicri:level="1">
<country xml:lang="fr">Allemagne</country>
<wicri:regionArea>Hamburg University, SFB Multilingualism</wicri:regionArea>
<wicri:noRegion>SFB Multilingualism</wicri:noRegion>
<wicri:noRegion>SFB Multilingualism</wicri:noRegion>
</affiliation>
<affiliation wicri:level="1">
<country xml:lang="fr">Allemagne</country>
<wicri:regionArea>Bielefeld University, Faculty of Linguistics and Literary Studies</wicri:regionArea>
<wicri:noRegion>Faculty of Linguistics and Literary Studies</wicri:noRegion>
<wicri:noRegion>Faculty of Linguistics and Literary Studies</wicri:noRegion>
</affiliation>
</author>
<author wicri:is="90%">
<name sortKey="Hinrichs, Erhard" sort="Hinrichs, Erhard" uniqKey="Hinrichs E" first="Erhard" last="Hinrichs">Erhard Hinrichs</name>
<affiliation wicri:level="3">
<country xml:lang="fr">Allemagne</country>
<wicri:regionArea>Institut für Deutsche Sprache, Mannheim</wicri:regionArea>
<placeName>
<region type="land" nuts="1">Bade-Wurtemberg</region>
<region type="district" nuts="2">District de Karlsruhe</region>
<settlement type="city">Mannheim</settlement>
</placeName>
</affiliation>
<affiliation wicri:level="3">
<country xml:lang="fr">Allemagne</country>
<wicri:regionArea>vionto GmbH, Berlin</wicri:regionArea>
<placeName>
<region type="land" nuts="3">Berlin</region>
<settlement type="city">Berlin</settlement>
</placeName>
</affiliation>
<affiliation wicri:level="1">
<country xml:lang="fr">Allemagne</country>
<wicri:regionArea>Tübingen University, General and Computational Linguistics</wicri:regionArea>
<wicri:noRegion>General and Computational Linguistics</wicri:noRegion>
<wicri:noRegion>General and Computational Linguistics</wicri:noRegion>
</affiliation>
<affiliation wicri:level="1">
<country xml:lang="fr">Allemagne</country>
<wicri:regionArea>Hamburg University, SFB Multilingualism</wicri:regionArea>
<wicri:noRegion>SFB Multilingualism</wicri:noRegion>
<wicri:noRegion>SFB Multilingualism</wicri:noRegion>
</affiliation>
<affiliation wicri:level="1">
<country xml:lang="fr">Allemagne</country>
<wicri:regionArea>Bielefeld University, Faculty of Linguistics and Literary Studies</wicri:regionArea>
<wicri:noRegion>Faculty of Linguistics and Literary Studies</wicri:noRegion>
<wicri:noRegion>Faculty of Linguistics and Literary Studies</wicri:noRegion>
</affiliation>
</author>
<author wicri:is="90%">
<name sortKey="Lehmberg, Timm" sort="Lehmberg, Timm" uniqKey="Lehmberg T" first="Timm" last="Lehmberg">Timm Lehmberg</name>
<affiliation wicri:level="3">
<country xml:lang="fr">Allemagne</country>
<wicri:regionArea>Institut für Deutsche Sprache, Mannheim</wicri:regionArea>
<placeName>
<region type="land" nuts="1">Bade-Wurtemberg</region>
<region type="district" nuts="2">District de Karlsruhe</region>
<settlement type="city">Mannheim</settlement>
</placeName>
</affiliation>
<affiliation wicri:level="3">
<country xml:lang="fr">Allemagne</country>
<wicri:regionArea>vionto GmbH, Berlin</wicri:regionArea>
<placeName>
<region type="land" nuts="3">Berlin</region>
<settlement type="city">Berlin</settlement>
</placeName>
</affiliation>
<affiliation wicri:level="1">
<country xml:lang="fr">Allemagne</country>
<wicri:regionArea>Tübingen University, General and Computational Linguistics</wicri:regionArea>
<wicri:noRegion>General and Computational Linguistics</wicri:noRegion>
<wicri:noRegion>General and Computational Linguistics</wicri:noRegion>
</affiliation>
<affiliation wicri:level="1">
<country xml:lang="fr">Allemagne</country>
<wicri:regionArea>Hamburg University, SFB Multilingualism</wicri:regionArea>
<wicri:noRegion>SFB Multilingualism</wicri:noRegion>
<wicri:noRegion>SFB Multilingualism</wicri:noRegion>
</affiliation>
<affiliation wicri:level="1">
<country xml:lang="fr">Allemagne</country>
<wicri:regionArea>Bielefeld University, Faculty of Linguistics and Literary Studies</wicri:regionArea>
<wicri:noRegion>Faculty of Linguistics and Literary Studies</wicri:noRegion>
<wicri:noRegion>Faculty of Linguistics and Literary Studies</wicri:noRegion>
</affiliation>
</author>
<author wicri:is="90%">
<name sortKey="Stegmann, Jens" sort="Stegmann, Jens" uniqKey="Stegmann J" first="Jens" last="Stegmann">Jens Stegmann</name>
<affiliation wicri:level="3">
<country xml:lang="fr">Allemagne</country>
<wicri:regionArea>Institut für Deutsche Sprache, Mannheim</wicri:regionArea>
<placeName>
<region type="land" nuts="1">Bade-Wurtemberg</region>
<region type="district" nuts="2">District de Karlsruhe</region>
<settlement type="city">Mannheim</settlement>
</placeName>
</affiliation>
<affiliation wicri:level="3">
<country xml:lang="fr">Allemagne</country>
<wicri:regionArea>vionto GmbH, Berlin</wicri:regionArea>
<placeName>
<region type="land" nuts="3">Berlin</region>
<settlement type="city">Berlin</settlement>
</placeName>
</affiliation>
<affiliation wicri:level="1">
<country xml:lang="fr">Allemagne</country>
<wicri:regionArea>Tübingen University, General and Computational Linguistics</wicri:regionArea>
<wicri:noRegion>General and Computational Linguistics</wicri:noRegion>
<wicri:noRegion>General and Computational Linguistics</wicri:noRegion>
</affiliation>
<affiliation wicri:level="1">
<country xml:lang="fr">Allemagne</country>
<wicri:regionArea>Hamburg University, SFB Multilingualism</wicri:regionArea>
<wicri:noRegion>SFB Multilingualism</wicri:noRegion>
<wicri:noRegion>SFB Multilingualism</wicri:noRegion>
</affiliation>
<affiliation wicri:level="1">
<country xml:lang="fr">Allemagne</country>
<wicri:regionArea>Bielefeld University, Faculty of Linguistics and Literary Studies</wicri:regionArea>
<wicri:noRegion>Faculty of Linguistics and Literary Studies</wicri:noRegion>
<wicri:noRegion>Faculty of Linguistics and Literary Studies</wicri:noRegion>
</affiliation>
</author>
</analytic>
<monogr></monogr>
<series>
<title level="j">Literary and Linguistic Computing</title>
<idno type="ISSN">0268-1145</idno>
<idno type="eISSN">1477-4615</idno>
<imprint>
<publisher>Oxford University Press</publisher>
<date type="published" when="2009-09">2009-09</date>
<biblScope unit="volume">24</biblScope>
<biblScope unit="issue">3</biblScope>
<biblScope unit="page" from="363">363</biblScope>
<biblScope unit="page" to="372">372</biblScope>
</imprint>
<idno type="ISSN">0268-1145</idno>
</series>
<idno type="istex">9F3C60DBB95AD64EA616839B33A16ACA18E60DB9</idno>
<idno type="DOI">10.1093/llc/fqp024</idno>
<idno type="ArticleID">fqp024</idno>
</biblStruct>
</sourceDesc>
<seriesStmt>
<idno type="ISSN">0268-1145</idno>
</seriesStmt>
</fileDesc>
<profileDesc>
<textClass>
<keywords scheme="KwdEn" xml:lang="en">
<term>Corpus annotation</term>
<term>Feature structure</term>
<term>Linguistic resources</term>
<term>Markup language</term>
<term>TEI</term>
</keywords>
<keywords scheme="Pascal" xml:lang="fr">
<term>Annotation de corpus</term>
<term>Langage de balisage</term>
<term>Ressources linguistiques</term>
<term>Structure de traits</term>
<term>TEI</term>
</keywords>
</textClass>
<langUsage>
<language ident="en">en</language>
</langUsage>
</profileDesc>
</teiHeader>
<front>
<div type="abstract">This article shows that the TEI tag set for feature structures can be adopted to represent a heterogeneous set of linguistic corpora. The majority of corpora is annotated using markup languages that are based on the Annotation Graph framework, the upcoming Linguistic Annotation Format ISO standard, or according to tag sets defined by or based upon the TEI guidelines. A unified representation comprises the separation of conceptually different annotation layers contained in the original corpus data (e.g. syntax, phonology, and semantics) into multiple XML files. These annotation layers are linked to each other implicitly by the identical textual content of all files. A suitable data structure for the representation of these annotations is a multi-rooted tree that again can be represented by the TEI and ISO tag set for feature structures. The mapping process and representational issues are discussed as well as the advantages and drawbacks associated with the use of the TEI tag set for feature structures as a storage and exchange format for linguistically annotated data.</div>
</front>
</TEI>
<affiliations>
<list>
<country>
<li>Allemagne</li>
</country>
<region>
<li>Bade-Wurtemberg</li>
<li>Berlin</li>
<li>District de Karlsruhe</li>
</region>
<settlement>
<li>Berlin</li>
<li>Mannheim</li>
</settlement>
</list>
<tree>
<country name="Allemagne">
<region name="Bade-Wurtemberg">
<name sortKey="Witt, Andreas" sort="Witt, Andreas" uniqKey="Witt A" first="Andreas" last="Witt">Andreas Witt</name>
</region>
<name sortKey="Hinrichs, Erhard" sort="Hinrichs, Erhard" uniqKey="Hinrichs E" first="Erhard" last="Hinrichs">Erhard Hinrichs</name>
<name sortKey="Hinrichs, Erhard" sort="Hinrichs, Erhard" uniqKey="Hinrichs E" first="Erhard" last="Hinrichs">Erhard Hinrichs</name>
<name sortKey="Hinrichs, Erhard" sort="Hinrichs, Erhard" uniqKey="Hinrichs E" first="Erhard" last="Hinrichs">Erhard Hinrichs</name>
<name sortKey="Hinrichs, Erhard" sort="Hinrichs, Erhard" uniqKey="Hinrichs E" first="Erhard" last="Hinrichs">Erhard Hinrichs</name>
<name sortKey="Hinrichs, Erhard" sort="Hinrichs, Erhard" uniqKey="Hinrichs E" first="Erhard" last="Hinrichs">Erhard Hinrichs</name>
<name sortKey="Lehmberg, Timm" sort="Lehmberg, Timm" uniqKey="Lehmberg T" first="Timm" last="Lehmberg">Timm Lehmberg</name>
<name sortKey="Lehmberg, Timm" sort="Lehmberg, Timm" uniqKey="Lehmberg T" first="Timm" last="Lehmberg">Timm Lehmberg</name>
<name sortKey="Lehmberg, Timm" sort="Lehmberg, Timm" uniqKey="Lehmberg T" first="Timm" last="Lehmberg">Timm Lehmberg</name>
<name sortKey="Lehmberg, Timm" sort="Lehmberg, Timm" uniqKey="Lehmberg T" first="Timm" last="Lehmberg">Timm Lehmberg</name>
<name sortKey="Lehmberg, Timm" sort="Lehmberg, Timm" uniqKey="Lehmberg T" first="Timm" last="Lehmberg">Timm Lehmberg</name>
<name sortKey="Rehm, Georg" sort="Rehm, Georg" uniqKey="Rehm G" first="Georg" last="Rehm">Georg Rehm</name>
<name sortKey="Rehm, Georg" sort="Rehm, Georg" uniqKey="Rehm G" first="Georg" last="Rehm">Georg Rehm</name>
<name sortKey="Rehm, Georg" sort="Rehm, Georg" uniqKey="Rehm G" first="Georg" last="Rehm">Georg Rehm</name>
<name sortKey="Rehm, Georg" sort="Rehm, Georg" uniqKey="Rehm G" first="Georg" last="Rehm">Georg Rehm</name>
<name sortKey="Rehm, Georg" sort="Rehm, Georg" uniqKey="Rehm G" first="Georg" last="Rehm">Georg Rehm</name>
<name sortKey="Stegmann, Jens" sort="Stegmann, Jens" uniqKey="Stegmann J" first="Jens" last="Stegmann">Jens Stegmann</name>
<name sortKey="Stegmann, Jens" sort="Stegmann, Jens" uniqKey="Stegmann J" first="Jens" last="Stegmann">Jens Stegmann</name>
<name sortKey="Stegmann, Jens" sort="Stegmann, Jens" uniqKey="Stegmann J" first="Jens" last="Stegmann">Jens Stegmann</name>
<name sortKey="Stegmann, Jens" sort="Stegmann, Jens" uniqKey="Stegmann J" first="Jens" last="Stegmann">Jens Stegmann</name>
<name sortKey="Stegmann, Jens" sort="Stegmann, Jens" uniqKey="Stegmann J" first="Jens" last="Stegmann">Jens Stegmann</name>
<name sortKey="Witt, Andreas" sort="Witt, Andreas" uniqKey="Witt A" first="Andreas" last="Witt">Andreas Witt</name>
<name sortKey="Witt, Andreas" sort="Witt, Andreas" uniqKey="Witt A" first="Andreas" last="Witt">Andreas Witt</name>
<name sortKey="Witt, Andreas" sort="Witt, Andreas" uniqKey="Witt A" first="Andreas" last="Witt">Andreas Witt</name>
<name sortKey="Witt, Andreas" sort="Witt, Andreas" uniqKey="Witt A" first="Andreas" last="Witt">Andreas Witt</name>
<name sortKey="Witt, Andreas" sort="Witt, Andreas" uniqKey="Witt A" first="Andreas" last="Witt">Andreas Witt</name>
</country>
</tree>
</affiliations>
</record>

To manipulate this document under Unix (Dilib)

EXPLOR_STEP=$WICRI_ROOT/Wicri/Ticri/explor/TeiVM2/Data/Main/Exploration
HfdSelect -h $EXPLOR_STEP/biblio.hfd -nk 000087 | SxmlIndent | more

Or

HfdSelect -h $EXPLOR_AREA/Data/Main/Exploration/biblio.hfd -nk 000087 | SxmlIndent | more

To link to this page within the Wicri network

{{Explor lien
   |wiki=    Wicri/Ticri
   |area=    TeiVM2
   |flux=    Main
   |étape=   Exploration
   |type=    RBID
   |clé=     ISTEX:9F3C60DBB95AD64EA616839B33A16ACA18E60DB9
   |texte=   SusTEInability of linguistic resources through feature structures
}}

This area was generated with Dilib version V0.6.31.
Data generation: Mon Oct 30 21:59:18 2017. Site generation: Sun Feb 11 23:16:06 2024